18 research outputs found

    PER-MARE: Adaptive Deployment of MapReduce over Pervasive Grids

    MapReduce is a parallel programming paradigm successfully used to perform computations on massive amounts of data, and it is widely deployed on cluster, grid, and cloud infrastructures. While the emergence of cloud infrastructures has opened new perspectives, several enterprises hesitate to put sensitive data on the cloud and prefer to rely on internal resources. In this paper we introduce the PER-MARE initiative, which aims to propose scalable techniques to support existing MapReduce data-intensive applications on loosely coupled networks such as pervasive and desktop grids. By relying on the MapReduce programming model, PER-MARE explores the potential advantages of using free, unused resources available at enterprises as pervasive grids, alone or in a hybrid environment. This paper presents the main lines that guide the PER-MARE approach and some preliminary results.

    Cloud computing with Google Apps for education: An experience report

    This article presents an experience report on using Google Apps for Education in a computer science laboratory at the Federal University of Santa Maria, Brazil. The Google Apps platform offers a range of applications in a SaaS (Software as a Service) cloud, bringing several conveniences for members of the institution, but also some challenges for system administrators. Throughout the article, we describe the migration to the cloud platform, the current state of the migrated domain, and some opportunities that are being explored to best meet our requirements. Keywords: cloud computing, Software as a Service, Google Apps, system administration

    MapReduce Challenges on Pervasive Grids

    This study presents advances in designing and implementing scalable techniques to support the development and execution of MapReduce applications on pervasive distributed computing infrastructures, in the context of the PER-MARE project. A pervasive framework for MapReduce applications is very useful in practice, especially in scientific, enterprise, and educational centers that have many unused or underused computing resources. These resources can be exploited to solve relevant problems that demand large computing power, such as scientific computing applications and big data processing. In this study, we propose multiple techniques to support volatility and heterogeneity in MapReduce by applying two complementary approaches: improving the Apache Hadoop middleware by including context-awareness and fault-tolerance features, and providing an alternative pervasive grid implementation fully adapted to dynamic environments. The main design and implementation decisions for both alternatives are described and validated through experiments, demonstrating that our approaches provide high reliability when executing on pervasive environments. The analysis of the experiments also leads to several insights on the requirements and constraints of dynamic and volatile systems, reinforcing the importance of context-aware information and advanced fault-tolerance features to provide efficient and reliable MapReduce services on pervasive grids.

    Project of the kernel of a distributed operating system

    One of the modern trends in Computer Science has been the use of distribution to improve system performance, spreading the processing over a network of computers. Many models of distribution have been proposed, and one of the most promising is that in which the distribution is directly controlled by the operating system. Such a system is called a distributed operating system [TAN85], and its main goal is to give its users the illusion of a uniprocessor machine made up of the sum of the resources offered by the components of the network. To present that illusion, the operating system controls the use of the distributed resources transparently, regardless of where they are located, as they are requested and become available. A project named DIX is under development at CPGCC/UFRGS, whose goal is to gather experience in the field while developing a distributed operating system. The MINIX operating system was chosen as the software basis for the project because of its high degree of modularity, its message-passing IPC paradigm, and the availability of its source code. The initial hardware configuration is a set of Proceda workstations, each of which has two distinct processors that can run in parallel. The project started with the porting of MINIX to the workstations' heterogeneous multiprocessor environment. Because the workstations needed to exchange information and no suitable communication hardware was available, an alternative communication scheme was developed, based on the parallel interface present in the workstations.
    This work describes the kernel of the operating system. The adopted philosophy was to keep it as simple as possible, placing a large share of the tasks in server processes outside the kernel. Another goal was to change the original MINIX interface as little as possible, so that the upper layers of the system could remain functional. The main purpose of the kernel is therefore to supply processes with efficient mechanisms for message exchange and data transfer between processes. A method was developed for the globally unique identification of processes, together with an interprocess communication mechanism that supports locality transparency (the sender of a message is not aware of the destination's location), process migration, and failures of network nodes.
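
    The idea of globally unique process identifiers combined with location-transparent message passing can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not say how DIX builds its identifiers or routes messages, so the `GlobalPid` pair (creating node plus local counter), the `Kernel` class, and its location table are hypothetical names, not the DIX design.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GlobalPid:
    """Hypothetical global identifier: creating node plus a per-node counter."""
    origin_node: int
    local_id: int


@dataclass
class Kernel:
    """Toy message-passing kernel (illustrative only, not the DIX kernel)."""
    # Node on which each process currently runs; updated on migration.
    location: dict = field(default_factory=dict)
    # One mailbox per process, keyed by its global identifier.
    mailboxes: dict = field(default_factory=dict)
    # Next free local counter for each node.
    next_local: dict = field(default_factory=dict)

    def spawn(self, node: int) -> GlobalPid:
        """Create a process on `node` and return its globally unique id."""
        local_id = self.next_local.get(node, 0)
        self.next_local[node] = local_id + 1
        pid = GlobalPid(node, local_id)
        self.location[pid] = node
        self.mailboxes[pid] = deque()
        return pid

    def migrate(self, pid: GlobalPid, new_node: int) -> None:
        """Move a process; its identifier stays the same."""
        self.location[pid] = new_node

    def send(self, dest: GlobalPid, message: object) -> int:
        """Deliver `message` to `dest`; the sender never names a node.

        Returns the node that received the message, for inspection only.
        """
        node = self.location[dest]
        self.mailboxes[dest].append(message)
        return node

    def receive(self, pid: GlobalPid) -> object:
        """Take the oldest pending message for `pid`."""
        return self.mailboxes[pid].popleft()
```

    In this sketch a sender keeps using the same `GlobalPid` after `migrate` is called; only the kernel's location table changes, which is the sense of locality transparency described in the abstract.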

    Visualisation interactive et extensible de programmes parallèles à base de processus légers

    The research described in this dissertation was performed within the APACHE project (sponsored by CNRS, INPG, INRIA and UJF), whose aim is to study all the issues raised by combining efficiency and portability in the implementation of irregular applications. The work done by the APACHE project is integrated in the Athapascan software environment. In Athapascan, performance debugging is based on software tracing of the executions of parallel applications, followed by trace analysis and visualisation of the traced executions. The aim of this thesis was to provide programmers with a visualisation tool that helps them identify the performance errors of their programs by giving them the clearest possible representation of the execution of these programs. The main contribution of this thesis is the design and implementation of a visualisation tool called Pajé, which combines the three most important characteristics of such tools: extensibility, interactivity, and scalability. Extensibility is necessary to cope with the lack of a parallel programming standard and to ease the implementation in Pajé of visualisations that were not foreseen at design time. It is supported by the architecture of Pajé as a graph of generic components communicating through clearly defined protocols. Interactivity allows programmers to control the visualisation by moving in time or inspecting the contents of the displayed objects. To limit the amount of data that must be kept in memory to implement interactivity, a data structure called the observation window was defined, together with the algorithms to move it efficiently in time. Scalability is the ability to represent a potentially large number of graphical objects (threads, communications, tasks, etc.) evolving dynamically. It is mainly supported by allowing visualisation at several levels of abstraction, such that moving from one level to another simulates zooming.
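
    The observation window mentioned above can be pictured as a bounded slice of the trace held in memory while the rest stays on disk. The sketch below is only a guess at the general idea and is not Pajé's data structure or algorithm: it assumes a trace of `(timestamp, payload)` pairs sorted by time, a fixed window width, and forward-only movement, and the class name `ObservationWindow` and its methods are hypothetical.

```python
from collections import deque


class ObservationWindow:
    """Keeps in memory only the events with start <= timestamp < start + width.

    Illustrative sketch: forward-only movement over a time-sorted trace.
    """

    def __init__(self, events, width):
        self._source = iter(events)  # lazily read, e.g. from a trace file
        self._pending = None         # first event read but not yet in range
        self.width = width
        self.start = 0.0
        self.in_memory = deque()
        self._fill()

    def _fill(self):
        """Load events from the source until one falls past the window end."""
        end = self.start + self.width
        while True:
            if self._pending is None:
                self._pending = next(self._source, None)
            if self._pending is None or self._pending[0] >= end:
                return
            # Events older than the window start are skipped, not stored.
            if self._pending[0] >= self.start:
                self.in_memory.append(self._pending)
            self._pending = None

    def advance(self, new_start):
        """Slide the window forward, dropping events that fall out of it."""
        if new_start < self.start:
            raise ValueError("this sketch only moves forward in time")
        self.start = new_start
        while self.in_memory and self.in_memory[0][0] < new_start:
            self.in_memory.popleft()
        self._fill()
```

    Each event is appended and removed at most once, so sliding the window costs time proportional to the number of events that enter or leave it rather than to the size of the whole trace.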

    Pajé, an Extensible and Interactive and Scalable Environment for Visualizing Parallel Program Executions

    This report describes Pajé, an interactive visualization tool for displaying the execution of parallel applications where a (potentially) large number of communicating threads of various lifetimes execute on each node of a distributed-memory parallel system. Pajé is capable of representing a wide variety of interactions between threads. The main novelty of Pajé is an original combination of three of the most desirable properties of visualization tools for parallel programs: extensibility, interactivity, and scalability. Interactivity makes it possible to inspect all the objects displayed on the current screen and to move back and forth in time. Scalability is the ability to cope with long computations involving a large number of threads. Extensibility makes it possible to easily extend the environment with new functionality. The interactivity and scalability of Pajé are exemplified by the performance tuning of a molecular dynamics application. To be easier to extend, Pajé was designed as a data-flow graph of modular components, most of them independent of the semantics of the programming model of the visualized parallel programs. In addition, the genericity of the main components of Pajé allows application programmers to adapt the visualization to their needs without having to know the internals of, or modify, any component of Pajé.
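
    A data-flow graph of modular components, as described in this abstract, can be sketched in a few lines. The classes below are hypothetical and do not reproduce Pajé's actual components or protocols; they only assume that events are dictionaries with a `"type"` key and that each component forwards events to whatever is connected downstream.

```python
class Component:
    """Generic node of the data-flow graph; forwards every event unchanged."""

    def __init__(self):
        self.outputs = []

    def connect(self, downstream):
        """Attach a downstream component and return it, so calls can chain."""
        self.outputs.append(downstream)
        return downstream

    def emit(self, event):
        for component in self.outputs:
            component.receive(event)

    def receive(self, event):
        self.emit(event)


class TypeFilter(Component):
    """Passes on only the events whose type is in `kept_types`."""

    def __init__(self, kept_types):
        super().__init__()
        self.kept_types = set(kept_types)

    def receive(self, event):
        if event["type"] in self.kept_types:
            self.emit(event)


class Collector(Component):
    """Sink that stands in for a drawing component by recording events."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def receive(self, event):
        self.seen.append(event)


def run_pipeline(events, kept_types):
    """Build source -> filter -> sink, push the events through, return the sink's view."""
    source = Component()
    sink = Collector()
    source.connect(TypeFilter(kept_types)).connect(sink)
    for event in events:
        source.receive(event)
    return sink.seen
```

    Because every component exposes the same `receive`/`emit` interface, a new filter or view can be spliced into the graph without touching the existing components, which is the kind of extensibility the abstract attributes to the data-flow design.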
